hysop.numerics.fft.gpyfft_fft module¶
FFT interface for fast Fourier transforms using the clFFT backend (through gpyfft).
GpyFFT
GpyFFTPlan
- class hysop.numerics.fft.gpyfft_fft.GpyDCTIIIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyDCTIIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyDCTIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyDSTIIIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyDSTIIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyDSTIPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyR2RPlan
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
- class hysop.numerics.fft.gpyfft_fft.GpyFFT(cl_env, backend=None, allocator=None, warn_on_allocation=True, warn_on_unaligned_output_offset=True, error_on_allocation=False, **kwds)[source]¶
Bases:
OpenClFFTI
Interface to compute process-local FFT-like transforms using the clFFT backend through the gpyfft Python interface.
- The clFFT backend has many advantages:
single and double precision are supported.
no intermediate temporary buffers are created at each call.
all required temporary buffers can be supplied or are auto-allocated only once.
real planning capability (but no explicit caching capabilities).
injection of custom OpenCL code for pre and post processing.
- It also has some disadvantages:
Poor support for OpenCL CPU devices.
The library is too greedy with temporary buffers for big transforms.
Planning may destroy the initial array contents. Executing a plan may result in unwanted writes to output data, see notes.
- Note that custom code injection is not available for all transforms:
All real to real transforms are implemented using pre and post processing capabilities.
Pre and post processing is used to inject array base offsets.
Users are responsible for extending the previously defined pre and post processing OpenCL code.
Notes
The output array is used during the transform, and if out.data is not aligned on device.MEM_BASE_ADDR_ALIGN the beginning of the buffer may be overwritten by intermediate transform results.
out.data = out.base_data + out.offset
If (offset % alignment > 0), out.base_data[0:out.size] may be trashed during computation and the result of the transform will go to out.base_data[out.offset:out.offset+out.size].
Thus, for every transform, out.base_data[0:min(out.offset,out.size)] may be overwritten with trash data. The default behaviour is to emit a warning when the output data is not aligned on a device memory boundary.
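The rule stated in these notes can be written down as a small check. The following is only a sketch: `possibly_overwritten_range` is a hypothetical helper, not part of HySoP or gpyfft, and it assumes that `offset`, `size` and `alignment` are all expressed in the same unit.

```python
def possibly_overwritten_range(offset, size, alignment):
    """Return the (start, stop) range of out.base_data that may be trashed.

    Hypothetical helper mirroring the note above. All three arguments are
    assumed to use the same unit. Returns None when the offset is aligned.
    """
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    if offset % alignment == 0:
        # Aligned output (including offset == 0): nothing is overwritten.
        return None
    # Unaligned output: the start of the base buffer may receive
    # intermediate results, up to min(offset, size).
    return (0, min(offset, size))


aligned = possibly_overwritten_range(offset=256, size=100, alignment=128)
unaligned = possibly_overwritten_range(offset=48, size=100, alignment=128)
```

With these inputs, `aligned` is `None` and `unaligned` covers the first 48 units of the base buffer.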
Initializes the interface and default supported real and complex types.
- dct(a, out=None, type=2, axis=-1, **kwds)[source]¶
Compute the one-dimensional Cosine Transform of specified type.
- Parameters:
a (array_like) – Real input array.
out (array_like) – Real output array of matching input type and shape.
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
- Return type:
(shape, dtype) of the output array determined from the input array.
- dst(a, out=None, type=2, axis=-1, **kwds)[source]¶
Compute the one-dimensional Sine Transform of specified type.
- Parameters:
a (array_like) – Real input array.
out (array_like) – Real output array of matching input type and shape.
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
- Return type:
(shape, dtype) of the output array determined from the input array.
- fft(a, out=None, axis=-1, **kwds)[source]¶
Compute the unscaled one-dimensional complex to complex discrete Fourier Transform.
- Parameters:
a (array_like of np.complex64 or np.complex128) – Complex input array.
out (array_like of np.complex64 or np.complex128) – Complex output array of the same shape and dtype as the input.
axis (int, optional) – Axis over which to compute the FFT. Defaults to last axis.
- Return type:
(shape, dtype) of the output array determined from the input array.
Notes
N = a.shape[axis]. out[0] will contain the sum of the signal (zero-frequency term, always real for real inputs).
- If N is even:
out[1:N/2] contains the positive frequency terms. out[N/2] contains the Nyquist frequency (always real for real inputs). out[N/2+1:] contains the negative frequency terms.
- Else if N is odd:
out[1:(N+1)/2] contains the positive frequency terms. out[(N+1)/2:] contains the negative frequency terms.
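The index layout can be made concrete with a short sketch. `frequency_layout` is a hypothetical helper (not part of HySoP) that assumes the standard unscaled DFT ordering, with Python half-open ranges.

```python
def frequency_layout(n):
    """Describe where each kind of frequency term sits in a length-n DFT.

    Hypothetical helper; assumes the standard DFT output ordering.
    Ranges are half-open, like Python slices.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n % 2 == 0:
        return {
            "zero": 0,
            "positive": range(1, n // 2),
            "nyquist": n // 2,
            "negative": range(n // 2 + 1, n),
        }
    # Odd sizes have no Nyquist term.
    return {
        "zero": 0,
        "positive": range(1, (n + 1) // 2),
        "nyquist": None,
        "negative": range((n + 1) // 2, n),
    }


even_layout = frequency_layout(8)
odd_layout = frequency_layout(5)
```

For a size of 8 the positive terms are at indices 1 to 3, the Nyquist term at index 4 and the negative terms at indices 5 to 7. For a size of 5 the positive terms are at indices 1 and 2 and the negative terms at indices 3 and 4.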
- idct(a, out=None, type=2, axis=-1, **kwds)[source]¶
Compute the one-dimensional Inverse Cosine Transform of specified type.
- Default scaling is 1/(2*N) for IDCT type (2,3,4) and
1/(2*N-2) for IDCT type 1.
- Parameters:
a (array_like) – Real input array.
out (array_like) – Real output array of matching input type and shape.
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
- Returns:
(shape, dtype, inverse_type, logical_size) of the output array determined from the input array.
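To illustrate where the 1/(2*N) default scaling comes from, here is a pure-Python sketch. It assumes the unnormalized DCT-II and DCT-III definitions used by FFTW and SciPy; whether HySoP follows exactly this convention is an assumption, and the function names are hypothetical. Under that convention, DCT-III applied to the output of DCT-II returns the input multiplied by 2*N.

```python
import math


def dct2(x):
    """Unnormalized DCT-II (FFTW/SciPy convention, assumed here)."""
    n = len(x)
    return [
        2.0 * sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                  for j in range(n))
        for k in range(n)
    ]


def dct3(y):
    """Unnormalized DCT-III, the unscaled inverse of dct2."""
    n = len(y)
    return [
        y[0] + 2.0 * sum(y[j] * math.cos(math.pi * (2 * k + 1) * j / (2 * n))
                         for j in range(1, n))
        for k in range(n)
    ]


def idct2(y):
    """Inverse of dct2: DCT-III scaled by 1/(2*N)."""
    n = len(y)
    return [value / (2 * n) for value in dct3(y)]


signal = [1.0, -2.0, 0.5, 4.0, 3.0]
recovered = idct2(dct2(signal))
```

Here `recovered` matches `signal` up to floating point rounding.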
- idst(a, out=None, type=2, axis=-1, **kwds)[source]¶
Compute the one-dimensional Inverse Sine Transform of specified type.
- Default scaling is 1/(2*N) for IDST type (2,3,4) and
1/(2*N+2) for IDST type 1.
- Parameters:
a (array_like) – Real input array.
out (array_like) – Real output array of matching input type and shape.
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
- Returns:
(shape, dtype, inverse_type, logical_size) of the output array determined from the input array.
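The 1/(2*N+2) scaling for type 1 can be checked the same way. This sketch assumes the unnormalized DST-I definition used by FFTW and SciPy (an assumption about HySoP's convention), under which DST-I is its own inverse up to a factor 2*(N+1). The function names are hypothetical.

```python
import math


def dst1(x):
    """Unnormalized DST-I (FFTW/SciPy convention, assumed here)."""
    n = len(x)
    return [
        2.0 * sum(x[j] * math.sin(math.pi * (k + 1) * (j + 1) / (n + 1))
                  for j in range(n))
        for k in range(n)
    ]


def idst1(y):
    """Inverse of dst1: the same transform scaled by 1/(2*N+2)."""
    n = len(y)
    return [value / (2 * n + 2) for value in dst1(y)]


samples = [0.5, 1.5, -1.0, 2.0]
round_trip = idst1(dst1(samples))
```

Here `round_trip` matches `samples` up to floating point rounding.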
- ifft(a, out=None, axis=-1, **kwds)[source]¶
Compute the one-dimensional complex to complex discrete Fourier Transform scaled by 1/N.
- Parameters:
a (array_like of np.complex64 or np.complex128) – Complex input array.
out (array_like of np.complex64 or np.complex128) – Complex output array of the same shape and dtype as the input.
axis (int, optional) – Axis over which to compute the FFT. Defaults to last axis.
- Return type:
(shape, dtype, logical_size) of the output array determined from the input array.
- irfft(a, out=None, n=None, axis=-1, **kwds)[source]¶
Compute the one-dimensional Hermitian complex to real discrete Fourier Transform scaled by 1/N.
- Parameters:
a (array_like of np.complex64 or np.complex128) – Complex input array.
out (array_like of np.float32 or np.float64) –
Real output array of matching real type. out.shape[…] = a.shape[…] on every other axis. The transformed axis should match the size used by the forward transform, which is one of:
out.shape[axis] = 2*(a.shape[axis]-1)
out.shape[axis] = 2*(a.shape[axis]-1) + 1
n (int, optional) – Length of the transformed axis of the output, i.e. n should be in [2*(a.shape[axis]-1), 2*(a.shape[axis]-1)+1].
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
Notes
To get an odd number of output points, n or out must be specified.
- Returns:
(shape, dtype, logical_size) of the output array determined from the input array, out and n.
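The constraint on n can be sketched as a small validation function. `irfft_output_length` is a hypothetical helper, not HySoP's implementation. It assumes that the even length is chosen when n is not given, which is what the note about odd output sizes implies.

```python
def irfft_output_length(m, n=None):
    """Return the real output length for a half-spectrum of length m.

    Hypothetical helper. m is a.shape[axis]; n is the optional requested
    output length along the transformed axis.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    even = 2 * (m - 1)
    odd = even + 1
    if n is None:
        # Assumed default: the even size.
        return even
    if n not in (even, odd):
        raise ValueError(f"n must be {even} or {odd}, got {n}")
    return n


default_length = irfft_output_length(5)
odd_length = irfft_output_length(5, n=9)
```

A half-spectrum of 5 complex values therefore maps back to 8 real values by default, or to 9 when requested explicitly.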
- rfft(a, out=None, axis=-1, **kwds)[source]¶
Compute the unscaled one-dimensional real to Hermitian complex discrete Fourier Transform.
- Parameters:
a (array_like of np.float32 or np.float64) – Real input array.
out (array_like of np.complex64 or np.complex128) – Complex output array of matching complex dtype. out.shape[…] = a.shape[…] on every other axis, and out.shape[axis] = a.shape[axis]//2 + 1.
axis (int, optional) – Axis over which to compute the transform. Defaults to last axis.
- Return type:
(shape, dtype) of the output array determined from the input array.
Notes
For real inputs, Hermitian symmetry means that the negative frequency components carry no information that is not already available from the positive frequency components.
N = a.shape[axis], so that out.shape[axis] = N//2 + 1. out[0] will contain the sum of the signal (zero-frequency term, always real).
- If N is even:
out[1:N/2] contains the positive frequency terms. out[N/2] contains the Nyquist frequency (always real).
- Else if N is odd:
out[1:(N+1)/2] contains the positive frequency terms.
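The output shape rule can be sketched as follows. `rfft_output_shape` is a hypothetical helper, not part of HySoP, where the size along the transformed axis is that of the real input.

```python
def rfft_output_shape(shape, axis=-1):
    """Return the complex output shape for a real input of the given shape.

    Hypothetical helper: only the transformed axis changes, from N to
    N//2 + 1.
    """
    out_shape = list(shape)
    out_shape[axis] = out_shape[axis] // 2 + 1
    return tuple(out_shape)


even_case = rfft_output_shape((4, 8))
odd_case = rfft_output_shape((4, 9))
```

Both input sizes, 8 and 9, give 5 complex values along the transformed axis. This is why the inverse transform cannot recover an odd size from the spectrum shape alone.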
- class hysop.numerics.fft.gpyfft_fft.GpyFFTPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
OpenClFFTPlanI
Build a clFFT plan using the gpyfft python interface. Emit warnings when transform output has an unaligned buffer offset.
Wrap gpyfft.FFT to allow more versatile callback settings and buffer allocations.
- Parameters:
cl_env (OpenClEnvironment) – OpenCL environment that will provide a context and a default queue.
queue – OpenCL queue that will be used by default.
in_array (cl.Array or OpenClArray) – Real input array for this transform.
out_array (cl.Array or OpenClArray) – Real output array for this transform.
axes (array_like of ints) – Axes over which to compute the transform.
scaling (float, optional) – Force the scaling of the transform. If not given, no scaling is applied (unlike the clFFT default behaviour). The clFFT default scaling for backward transforms can be enabled by passing ‘DEFAULT’ to this parameter.
scale_by_size (int, optional, defaults to 1) – Extra scaling by an integer: 1.0/S will be applied during the post callback. This is equivalent to setting scaling to 1.0/S but the two parameters are not mutually exclusive.
fake_input (DummyArray, optional) – Fake array from which the transform shape and strides are computed. Only used by R2R transforms.
fake_output (DummyArray, optional) – Fake array from which the transform shape and strides are computed. Only used by R2R transforms.
direction_forward (bool, optional, defaults to True) – The direction of the transform. True <=> forward transform.
hardcode_twiddles (bool, optional, defaults to False) – Hardcode twiddles as a __constant static array of complex values directly in the OpenCL code. Only used by DCT-II, DCT-III, DST-II and DST-III. If set to False, the twiddles will be computed by the device on the fly, freeing device __constant memory banks.
warn_on_unaligned_output_offset (bool, optional, defaults to True) – Emit a warning if the planner encounters an output array that has a non-zero offset.
warn_on_allocation (bool, optional, defaults to True) – Emit a warning if the planner has to allocate OpenCL buffers.
error_on_allocation (bool, optional, defaults to False) – Raise a RuntimeError if the planner has to allocate OpenCL buffers.
- DEBUG = False¶
- classmethod calculate_transform_strides(taxes, array)[source]¶
Redefine gpyfft.FFT.calculate_transform_strides
- classmethod check_transform_shape(shape)[source]¶
Check that clFFT can handle the logical transform size.
- classmethod compute_input_array_offset(real_input, fake_input, axes, transform_offset='K', idx='k{}', batch_id='b', void_ptr='input', casted_ptr='in')[source]¶
- classmethod compute_output_array_offset(real_output, fake_output, axes, transform_offset='K', idx='k{}', batch_id='b', void_ptr='output', casted_ptr='out')[source]¶
- classmethod compute_pointer_offset(real_array, fake_array, axes, base_offset, transform_offset, idx, batch_id, fp, void_ptr, casted_ptr, is_input)[source]¶
- property context¶
- classmethod fake_array(shape, dtype, strides=None)[source]¶
Create a fake_array of given shape and dtype. If not given, the strides are computed from shape and dtype as if the array would be contiguous in memory.
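As an illustration of the default stride computation, the sketch below derives contiguous strides from a shape and an item size. `contiguous_strides` is a hypothetical helper, not the method's actual code. It assumes C (row-major) ordering and strides expressed in bytes, as NumPy does.

```python
def contiguous_strides(shape, itemsize):
    """Return byte strides of a C-contiguous array of the given shape.

    Hypothetical helper; assumes row-major order, where the last axis
    varies fastest.
    """
    strides = []
    stride = itemsize
    for extent in reversed(shape):
        strides.append(stride)
        stride *= extent
    return tuple(reversed(strides))


# A (2, 3, 4) array of 8-byte elements (e.g. float64).
strides = contiguous_strides((2, 3, 4), 8)
```

Here the last axis has a stride of 8 bytes, the middle axis 32 bytes and the first axis 96 bytes.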
- classmethod generate_twiddles(name, base, count, typegen, fp, hardcode_twiddles, idx='kx', Tvar='T')[source]¶
Generate twiddles as a string. OpenCl __constant static array: exp(base*k0) for k in 0..count
- classmethod get_array_offset(array, emit_warning)[source]¶
Get the array offset in terms of array elements, and emit a warning if the offset is non-zero and emit_warning is set.
- property input_array¶
Return currently planned input array.
- property output_array¶
Return currently planned output array.
- post_offset_callback(offset_output_pointer, out_fp, S, **kwds)[source]¶
Default post_offset_callback, just inject output array offset and scale by size (divide by some integer, which will often be the logical size of the transform or 1).
- pre_offset_callback(offset_input_pointer, in_fp, **kwds)[source]¶
Default pre_offset_callback, just inject input array offset.
- pre_offset_callback_C2R(offset_input_pointer, fp, N, **kwds)[source]¶
C2R specific pre_offset_callback, inject input array offset and force the Nyquist frequency to be purely real (fixes a bug in clFFT for even C2R transforms of dimension > 1).
- property queue¶
- property ready¶
- property required_buffer_size¶
Return the required temporary buffer size in bytes to compute the transform.
- class hysop.numerics.fft.gpyfft_fft.GpyR2RPlan(cl_env, queue, in_array, out_array, axes, scaling=None, scale_by_size=None, fake_input=None, fake_output=None, callback_kwds=None, direction_forward=True, hardcode_twiddles=False, warn_on_unaligned_output_offset=True, warn_on_allocation=True, error_on_allocation=False, **kwds)[source]¶
Bases:
GpyFFTPlan
Specialization for real to real transforms built from r2c or c2r transforms. Real to real transforms use fake arrays as input and output along with custom pre and post processing callbacks.
Handmade R2R transforms rely on fake input and output arrays that are never actually read or written. This is necessary because clFFT does not handle R2R transforms, so pre and post processing is used to compute an equivalent R2C or C2R problem.
Fake arrays are used to compute the transform size, batch size and strides. Pointers to the real arrays are passed to the kernels, and the pre and post callbacks map the input and output data from those real arrays, adjusting the stride computations from the fake array indices to the real array sizes.
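The general idea of building a real to real transform from a Fourier transform plus pre and post processing can be sketched in pure Python. This example uses Makhoul's reordering to obtain an unnormalized DCT-II from a same-size DFT. It is an illustration of the technique only, not HySoP's OpenCL callbacks: the function names are hypothetical, and a naive complex DFT stands in for the R2C transform that clFFT would perform.

```python
import cmath
import math


def naive_dft(v):
    """Unscaled forward DFT, O(N^2), standing in for the FFT backend."""
    n = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dct2_via_dft(x):
    """Unnormalized DCT-II computed with pre and post processing."""
    n = len(x)
    # Pre-processing: even-indexed samples first, odd-indexed reversed.
    v = [0.0] * n
    for i in range((n + 1) // 2):
        v[i] = x[2 * i]
    for i in range(n // 2):
        v[n - 1 - i] = x[2 * i + 1]
    spectrum = naive_dft(v)
    # Post-processing: multiply by the twiddle exp(-i*pi*k/(2N)),
    # keep the real part and scale by 2.
    return [
        2.0 * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


def dct2_direct(x):
    """Reference unnormalized DCT-II from its definition."""
    n = len(x)
    return [
        2.0 * sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                  for j in range(n))
        for k in range(n)
    ]


data = [1.0, 2.0, -1.0, 0.5, 3.0]
via_dft = dct2_via_dft(data)
reference = dct2_direct(data)
```

Both results agree up to floating point rounding. The per-index factors applied in the post-processing step are the kind of values that the hardcode_twiddles option concerns.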
- exception hysop.numerics.fft.gpyfft_fft.HysopGpyFftWarning[source]¶
Bases:
HysopWarning